Sequential Tests for Large-Scale Learning
Abstract
We argue that when faced with big data sets, learning and inference algorithms should compute updates using only subsets of data items. We introduce algorithms that use sequential hypothesis tests to adaptively select such a subset of data points. The statistical properties of this subsampling process can be used to control the efficiency and accuracy of learning or inference. In the context of learning by optimization, we test for the probability that the update direction is no more than 90 degrees in the wrong direction. In the context of posterior inference using Markov chain Monte Carlo, we test for the probability that our decision to accept or reject a sample is wrong. We experimentally evaluate our algorithms on a number of models and data sets.
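The abstract does not spell out the test itself, so the following is only a minimal sketch of one way the accept/reject idea could look, under stated assumptions. It assumes the decision reduces to asking whether the mean of N per-item quantities (for example, per-item log-likelihood differences between a proposed and a current sample) exceeds a threshold `mu0`. It also assumes a normal approximation with a finite-population correction in place of whatever test statistic the authors use. The names `sequential_decision`, `diffs`, `mu0`, `eps`, and `batch_size` are hypothetical and not taken from the paper.

```python
import math
import random
from statistics import NormalDist


def sequential_decision(diffs, mu0, eps=0.05, batch_size=50, rng=None):
    """Decide whether mean(diffs) > mu0 while looking at as few items as possible.

    Returns (decision, n_used). Hypothetical helper, not the authors' code.
    """
    if batch_size < 2:
        raise ValueError("batch_size must be at least 2 to estimate a variance")
    rng = rng or random.Random(0)
    N = len(diffs)
    order = list(range(N))
    rng.shuffle(order)  # visit items in random order, i.e. sample without replacement

    n = 0
    total = 0.0
    total_sq = 0.0
    while True:
        # Grow the subset by one mini-batch and update running sums.
        end = min(n + batch_size, N)
        for i in order[n:end]:
            total += diffs[i]
            total_sq += diffs[i] ** 2
        n = end
        mean = total / n

        # With all data seen, the decision is exact.
        if n == N:
            return mean > mu0, n

        # Sample variance and standard error of the mean, with a
        # finite-population correction because sampling is without replacement.
        var = max(total_sq - n * mean * mean, 0.0) / (n - 1)
        se = math.sqrt(var / n) * math.sqrt(1.0 - (n - 1) / (N - 1))

        if se == 0.0:
            if mean != mu0:
                return mean > mu0, n
            continue  # no evidence either way yet; look at more data

        # Assumption: normal approximation to the test statistic.
        z = abs(mean - mu0) / se
        delta = 1.0 - NormalDist().cdf(z)  # estimated chance the decision is wrong
        if delta < eps:
            return mean > mu0, n


if __name__ == "__main__":
    data_rng = random.Random(1)
    diffs = [1.0 + data_rng.gauss(0.0, 1.0) for _ in range(10_000)]
    decision, n_used = sequential_decision(diffs, mu0=0.0)
    print(decision, n_used)
```

In this sketch, `eps` plays the role of the knob the abstract describes: a smaller value demands more evidence before stopping, so more data items are examined per decision, while a larger value trades accuracy for speed.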
Similar resources
Comparing Bandwidth and Self-control Modeling on Learning a Sequential Timing Task
Modeling is a process in which an observer watches another person's behavior and adapts his or her own behavior as a result of that interaction. The aim of the present study was to investigate and compare the effectiveness of bandwidth modeling and self-control modeling on performance and learning of a sequential timing task. Two groups, bandwidth and self-control, were compared. The task was pres...
A Study of the Rockfill Material Behavior in Large-Scale Tests
Inspecting the behavior of the rockfill materials is of significant importance in analysis of rockfill dams. Since the dimensions of grains in such materials are greater than the conventional sizes suitable for soil mechanics tests, it is necessary to experimentally study them in specific large-scale apparatuses. In this research, the behavior of rockfill materials in two large rockfill dams co...
Sequential-Based Approach for Estimating the Stress-Strength Reliability Parameter for Exponential Distribution
In this paper, two-stage and purely sequential estimation procedures are considered to construct fixed-width confidence intervals for the reliability parameter under the stress-strength model when the stress and strength are independent exponential random variables with different scale parameters. The exact distribution of the stopping rule under the purely sequential procedure is approximated ...
Developing and validation of metamemory scale for adolescents
The purpose of this study was to develop and validate a metamemory scale for adolescents in the academic context. The study was mixed-method research of the sequential exploratory type, which in the qualitative stage used a triangulation method (aligning multiple data approach) with four dimensions: a) collecting literature reviews related to metamemory based on the theoretica...
Scaling Up Inductive Learning with Massive Parallelism
Machine learning programs need to scale up to very large data sets for several reasons, including increasing accuracy and discovering infrequent special cases. Current inductive learners perform well with hundreds or thousands of training examples, but in some cases, up to a million or more examples may be necessary to learn important special cases with confidence. These tasks are infeasible for...
Journal title:
- Neural computation
Volume 28, Issue 1
Pages -
Publication date 2016